fix(engine): reset thrashing state when user removes reconcile-paused annotation #218
❌ Generated Files Verification Failed One or more generated files in this PR are out of sync:
Please regenerate the files locally and commit the changes. |
… annotation

When a user removes the platform.kubevirt.io/reconcile-paused annotation to resume reconciliation, the in-memory ThrashingDetector state (consecutiveThrottles >= ThrashingThreshold) was never cleared. If the token bucket was still empty at that point (< 6 s after the pause was set), the very next throttled reconciliation immediately re-paused the resource, leaving it permanently stuck.

Add Step 1.6 in ReconcileAsset: after the IsPaused guard returns false, check whether consecutiveThrottles has already reached the threshold. If so, the operator previously set the annotation and the user has just removed it, so reset the thrashing detector for that resource. Subsequent throttles then count from 0 and do not re-trigger a pause until a genuine new edit war accumulates enough consecutive throttles.

Fixes: CNV-89796
Signed-off-by: Simone Tiraboschi <stirabos@redhat.com>
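The control flow described in the commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the operator's actual code: the `ThrashingDetector` methods, the `reconcile` helper, the threshold value of 3, and the choice to clear the counter on a non-throttled pass are all assumptions made for the example. Only the names `ThrashingDetector`, `consecutiveThrottles`, `ThrashingThreshold`, and the "Step 1.6" ordering come from the description above.

```go
package main

import "fmt"

// ThrashingThreshold is an assumed value; the real threshold is not stated in the PR.
const ThrashingThreshold = 3

// ThrashingDetector is a hypothetical stand-in that tracks consecutive
// throttled reconciliations per resource key, in memory only.
type ThrashingDetector struct {
	consecutiveThrottles map[string]int
}

func NewThrashingDetector() *ThrashingDetector {
	return &ThrashingDetector{consecutiveThrottles: map[string]int{}}
}

// Count returns the current consecutive-throttle count for key.
func (d *ThrashingDetector) Count(key string) int { return d.consecutiveThrottles[key] }

// Reset clears the counter for key.
func (d *ThrashingDetector) Reset(key string) { delete(d.consecutiveThrottles, key) }

// RecordThrottle increments the counter and reports whether the threshold is reached.
func (d *ThrashingDetector) RecordThrottle(key string) bool {
	d.consecutiveThrottles[key]++
	return d.consecutiveThrottles[key] >= ThrashingThreshold
}

// reconcile simulates one reconciliation pass. annotated says whether the
// reconcile-paused annotation is present; throttled says whether the token
// bucket is empty. It returns whether the resource is paused after the pass.
func reconcile(d *ThrashingDetector, key string, annotated, throttled bool) bool {
	// IsPaused guard: a paused resource is skipped entirely.
	if annotated {
		return true
	}
	// Step 1.6: not paused, yet the counter is already at the threshold.
	// The annotation must have been set earlier and then removed by the
	// user, so the stale state is cleared before counting again.
	if d.Count(key) >= ThrashingThreshold {
		d.Reset(key)
	}
	if throttled {
		return d.RecordThrottle(key)
	}
	// Assumption: a non-throttled pass ends the consecutive run.
	d.Reset(key)
	return false
}

func main() {
	d := NewThrashingDetector()
	key := "ns/asset"

	// An edit war: enough consecutive throttles to trigger a pause.
	paused := false
	for i := 0; i < ThrashingThreshold; i++ {
		paused = reconcile(d, key, false, true)
	}
	fmt.Println("paused after edit war:", paused) // true

	// The user removes the annotation while the bucket is still empty.
	paused = reconcile(d, key, false, true)
	fmt.Println("paused after annotation removed:", paused) // false
	fmt.Println("consecutive throttles:", d.Count(key))     // 1
}
```

Without the Step 1.6 check, the last call would increment a counter that is already at the threshold and report a pause again, which is the stuck state the commit message describes.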
|
/lgtm |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: tiraboschi |
Merged commit b2d9a36 into openshift-virtualization:main
|
/cherry-pick release-4.22 |
|
@tiraboschi: new pull request created: #224 |